finding a template

Suppose we wish to find a known template T(x,y) in a given image I(x,y).
This problem is known as template matching.

[Figure: example image (FC Barcelona)]

The template can be small or large.

alignment (MRI images)

[Figure: alignment of MRI images. Adapted from Kybic, Unser, 2003]



template matching 




template T(x,y), x,y = 0, ..., B-1

image I(x,y), x,y = 0, ..., N-1

displacement t = (u,v)



The solution is based on two steps: 

• define a matching criterion M (e.g., cross correlation) 

• find local maxima/minima (e.g., exhaustive search) 
object detection

matching criterion: M

nonlinear optimization

Non-minimum suppression:

$$M(t_0) < M(t) \quad \text{for all } t \text{ in a vicinity of radius } r \text{ of } t_0$$

Thresholding:

$$M(t_0) < \lambda$$

λ: threshold
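
The non-minimum suppression and thresholding steps can be sketched as follows. This is only an illustration: the function name `detect_minima`, the list-of-lists layout `M[v][u]` and the toy criterion map are assumptions, not part of the slides.

```python
def detect_minima(M, radius, threshold):
    """Return displacements (u, v) that pass both tests.

    M is a 2-D list indexed as M[v][u] (hypothetical layout).
    radius plays the role of r and threshold the role of lambda.
    """
    rows, cols = len(M), len(M[0])
    detections = []
    for v in range(rows):
        for u in range(cols):
            if M[v][u] >= threshold:          # thresholding: M(t0) < lambda
                continue
            is_minimum = True
            for dv in range(-radius, radius + 1):
                for du in range(-radius, radius + 1):
                    if (du, dv) == (0, 0) or du * du + dv * dv > radius * radius:
                        continue              # skip t0 itself and points outside the disc
                    vv, uu = v + dv, u + du
                    inside = 0 <= vv < rows and 0 <= uu < cols
                    if inside and M[vv][uu] <= M[v][u]:
                        is_minimum = False    # a neighbour is at least as good
            if is_minimum:
                detections.append((u, v))
    return detections


# Toy criterion map (made-up values, e.g. SSD scores)
M = [
    [9, 9, 9, 9, 9],
    [9, 1, 9, 9, 9],
    [9, 9, 9, 4, 9],
    [9, 9, 9, 9, 2],
]
print(detect_minima(M, radius=1, threshold=3))  # [(1, 1), (4, 3)]
```

The value 4 is a local minimum but is rejected by the threshold; increasing the radius suppresses weaker minima that lie close to a stronger one.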
matching criteria

cross-correlation

$$R(u,v) = \sum_{x,y=0}^{B-1} T(x,y)\, I(x+u, y+v)$$

sum of square differences (SSD) ($\ell_2$ norm, squared)

$$E(u,v) = \sum_{x,y=0}^{B-1} \left[ T(x,y) - I(x+u, y+v) \right]^2$$

sum of absolute differences (SAD) ($\ell_1$ norm)

$$E(u,v) = \sum_{x,y=0}^{B-1} \left| T(x,y) - I(x+u, y+v) \right|$$

Non-integer displacements can be considered. Image interpolation is required in this case.
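
A minimal sketch of the exhaustive search over integer displacements, for the three criteria. The function name `match_template`, the criterion labels and the toy image are assumptions made for illustration; a real implementation would typically use an optimized library routine.

```python
def match_template(image, template, criterion="ssd"):
    """Exhaustive search over integer displacements (u, v).

    image and template are 2-D lists indexed [y][x]. Returns the best
    displacement and its score: the minimum for "ssd" and "sad", the
    maximum for "xcorr" (plain, unnormalized cross-correlation).
    """
    if criterion not in ("ssd", "sad", "xcorr"):
        raise ValueError("unknown criterion: " + criterion)
    rows, cols = len(image), len(image[0])
    b_rows, b_cols = len(template), len(template[0])
    best, best_score = None, None
    for v in range(rows - b_rows + 1):
        for u in range(cols - b_cols + 1):
            score = 0.0
            for y in range(b_rows):
                for x in range(b_cols):
                    t, i = template[y][x], image[y + v][x + u]
                    if criterion == "ssd":
                        score += (t - i) ** 2
                    elif criterion == "sad":
                        score += abs(t - i)
                    else:
                        score += t * i
            if best_score is None:
                improved = True
            elif criterion == "xcorr":
                improved = score > best_score
            else:
                improved = score < best_score
            if improved:
                best, best_score = (u, v), score
    return best, best_score


# Toy data: the template appears exactly at displacement (u, v) = (2, 1)
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [9, 9, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
print(match_template(image, template, "ssd"))    # ((2, 1), 0.0)
print(match_template(image, template, "sad"))    # ((2, 1), 0.0)
print(match_template(image, template, "xcorr"))  # ((0, 2), 63.0)
```

In this toy example the unnormalized cross-correlation is attracted by the bright pixels in the last row, while SSD and SAD recover the true position.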
[Figure panels: cross-correlation, SSD, SAD]

There are better ways to detect faces (e.g., Viola & Jones)!



limitations

Template matching has weaknesses:

• not invariant to rotations and scaling → more general transformations

• not invariant to illumination changes → modify matching criteria to improve robustness

• time consuming

• template adaptation is tricky
problem formulation

Matlab

Image alignment

Given 2 (or more) images I, T we wish to estimate a transformation which maps the first into the second

$$(x,y) \rightarrow (x',y') = W(x,y;\theta)$$

according to some criterion.

This can be done using:

• feature based methods: based on the alignment of feature points (marks)

• image based methods: based on the alignment of image intensity or color
What geometric transformations can we use?

translation & rigid body

translation

$$W(x;\theta) = x + t, \qquad \theta = t$$

2 degrees of freedom

rigid body

$$W(x;\theta) = Rx + t, \qquad \theta = (R,t)$$

3 degrees of freedom

R is a rotation matrix:

$$RR^T = R^TR = I, \qquad \det(R) = 1, \qquad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

(inside R, θ denotes the rotation angle)



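affine and projective transformations

affine transformation

$$W(x;\theta) = Ax + t, \qquad \theta = (A,t)$$

6 degrees of freedom

projective transformation (homography)

$$W(x;\theta) = \begin{bmatrix} \dfrac{p_1 x + p_2 y + p_3}{p_7 x + p_8 y + p_9} \\[2ex] \dfrac{p_4 x + p_5 y + p_6}{p_7 x + p_8 y + p_9} \end{bmatrix}, \qquad \theta = (p_1, \ldots, p_9)$$

8 degrees of freedom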
projective and polynomial transformations

projective (contd.)

$$x' = \frac{\tilde{x}^T \mathbf{p}_1}{\tilde{x}^T \mathbf{p}_3}, \qquad y' = \frac{\tilde{x}^T \mathbf{p}_2}{\tilde{x}^T \mathbf{p}_3}$$

$$\mathbf{p}_1 = [p_1\ p_2\ p_3]^T, \quad \mathbf{p}_2 = [p_4\ p_5\ p_6]^T, \quad \mathbf{p}_3 = [p_7\ p_8\ p_9]^T, \quad \tilde{x} = [x\ y\ 1]^T$$

polynomial

Each coordinate of $W(x;\theta)$ is a polynomial in x and y:

$$\sum_{p,q:\ p+q<n} a_{pq}\, x^p y^q$$

with one set of coefficients per output coordinate.

The estimation of coefficients is numerically ill conditioned.

others: e.g., free form deformations



properties

[Table: DoF of each transformation and whether it preserves lines, parallelism, angles and length]
can we align images using intensity?

image based methods

Problem:

Given two images T, I we wish to find a geometric transformation W(x) which maps points of the first image into points of the second, such that I(W(x)) ≈ T(x) (color constancy).

Most popular criterion (SSD)

$$E = \sum_{x} \left[ T(x) - I(W(x;\theta)) \right]^2$$

Note: the sum is over all the points x in which both images T(x), I(W(x)) overlap.
The minimization of E is a nonlinear problem!
Lucas-Kanade (translation motion)

Criterion

$$E(u,v) = \sum_{x} \left[ T(x) - I(x+t) \right]^2$$

Parameter update

$$t = t_0 + \Delta t$$

First order approximation of the image

$$I(x+t) \approx I(x+t_0) + \nabla I(x+t_0)^T \Delta t$$

Lucas-Kanade algorithm (recursion)

$$\begin{bmatrix} \sum_x I_x^2 & \sum_x I_x I_y \\ \sum_x I_x I_y & \sum_x I_y^2 \end{bmatrix} \Delta t = \begin{bmatrix} \sum_x (T(x) - I(x+t_0))\, I_x \\ \sum_x (T(x) - I(x+t_0))\, I_y \end{bmatrix}$$

$$R\,\Delta t = r$$

$$t \leftarrow t_0 + \Delta t = t_0 + R^{-1} r$$

$I_x$, $I_y$ are the partial derivatives of I at $x + t_0$.

[Figure: convergence from several starting points]

The SSD criterion is not explicitly computed in the L-K algorithm.
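
The recursion can be illustrated with a small sketch for pure translation. To avoid interpolation, the image is taken here to be a smooth analytic function (a Gaussian blob), which is an assumption made only for this demo; the names `image`, `gradient` and `lucas_kanade` are likewise hypothetical.

```python
import math


def image(x, y):
    """Synthetic smooth image I(x, y): a Gaussian blob (demo assumption)."""
    return math.exp(-((x - 12.0) ** 2 + (y - 10.0) ** 2) / (2 * 4.0 ** 2))


def gradient(x, y, h=1e-3):
    """Central-difference estimate of the partial derivatives (I_x, I_y)."""
    ix = (image(x + h, y) - image(x - h, y)) / (2 * h)
    iy = (image(x, y + h) - image(x, y - h)) / (2 * h)
    return ix, iy


def lucas_kanade(template, points, t0=(0.0, 0.0), iterations=20):
    """Repeat t <- t0 + R^{-1} r for a translation t = (u, v)."""
    u, v = t0
    for _ in range(iterations):
        r11 = r12 = r22 = b1 = b2 = 0.0
        for (x, y), t_val in zip(points, template):
            ix, iy = gradient(x + u, y + v)        # derivatives of I at x + t0
            err = t_val - image(x + u, y + v)      # T(x) - I(x + t0)
            r11 += ix * ix
            r12 += ix * iy
            r22 += iy * iy
            b1 += err * ix
            b2 += err * iy
        det = r11 * r22 - r12 * r12
        du = (r22 * b1 - r12 * b2) / det           # solve the 2x2 system R dt = r
        dv = (r11 * b2 - r12 * b1) / det
        u, v = u + du, v + dv
    return u, v


# Template sampled from the image displaced by a known translation
true_t = (1.5, -0.8)
points = [(float(x), float(y)) for x in range(5, 16) for y in range(5, 16)]
template = [image(x + true_t[0], y + true_t[1]) for x, y in points]

u, v = lucas_kanade(template, points)
print(round(u, 4), round(v, 4))
```

Note that the loop never evaluates the SSD itself; it only accumulates the matrix R and the vector r. The starting point matters: this demo starts close enough to the true displacement for the first order approximation to be useful.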
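proof

Let us minimize

$$E = \sum_{x} \left[ T(x) - I(x+t_0) - \nabla I(x+t_0)^T \Delta t \right]^2$$

A necessary condition is

$$\frac{dE}{d\Delta t} = 0 \;\Rightarrow\; \sum_{x} \left[ T(x) - I(x+t_0) - \nabla I(x+t_0)^T \Delta t \right] \nabla I(x+t_0) = 0$$

$$\sum_{x} \nabla I(x+t_0)\, \nabla I(x+t_0)^T \Delta t = \sum_{x} \left[ T(x) - I(x+t_0) \right] \nabla I(x+t_0)$$

Defining

$$\nabla I(x+t_0) = \begin{bmatrix} I_x(x+t_0) \\ I_y(x+t_0) \end{bmatrix}$$

We obtain

$$\begin{bmatrix} \sum_x I_x^2 & \sum_x I_x I_y \\ \sum_x I_x I_y & \sum_x I_y^2 \end{bmatrix} \Delta t = \begin{bmatrix} \sum_x (T(x) - I(x+t_0))\, I_x \\ \sum_x (T(x) - I(x+t_0))\, I_y \end{bmatrix}$$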
discussion

L-K strong points

• uses all the available information

• it is simple

• appropriate for tracking

• can be extended to deal with general motion models

L-K weak points

• no guarantee that the optimal solution is obtained

• the solution depends on the initialization → use multiple scales

• convergence is difficult if the number of parameters is high

• the solution depends on the illumination → illumination can be estimated
can we align images from sparse prototypes?

feature based matching

Problem:

Given two sets of points $\{x_i\}$, $\{x'_i\}$ detected in the images T, I, we wish to find a geometric transformation W that maps the points $\{x_i\}$ into the points $\{x'_i\}$.

We assume that the correspondence is known: $x_i \leftrightarrow x'_i$

approach

Define a matching criterion, e.g.,

$$E(\theta) = \sum_{i} \| x'_i - W(x_i;\theta) \|^2 \qquad \text{(SSD criterion)}$$

Minimize the criterion with respect to θ using a closed form or a numeric algorithm.

Note: there are other matching criteria, e.g., the $\ell_1$ norm.
example

[Figure: input and output images; alignment using a projective transform]



estimation of a homography

Homography

$$x' = f(x,p): \qquad x' = \frac{p_1 x + p_2 y + p_3}{p_7 x + p_8 y + p_9}, \qquad y' = \frac{p_4 x + p_5 y + p_6}{p_7 x + p_8 y + p_9}, \qquad \|p\| = 1$$

f is a nonlinear function of the unknown parameters.
The minimization of the SSD criterion is difficult!

$$E(p) = \sum_{i} \| x'_i - f(x_i, p) \|^2$$

Idea: use another (simpler) criterion instead

$$(p_7 x + p_8 y + p_9)\, x' = p_1 x + p_2 y + p_3$$
$$(p_7 x + p_8 y + p_9)\, y' = p_4 x + p_5 y + p_6$$

algebraic error

$$e = \begin{bmatrix} (p_1 x + p_2 y + p_3) - (p_7 x + p_8 y + p_9)\, x' \\ (p_4 x + p_5 y + p_6) - (p_7 x + p_8 y + p_9)\, y' \end{bmatrix}, \qquad \|p\| = 1$$
estimation of the projective transform (2)

minimize

$$E_a = p^T M^T M p$$

with restriction $p^T p = 1$, where

$$M = \begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -x'_1 y_1 & -x'_1 \\
\vdots & & & & & & & & \vdots \\
x_n & y_n & 1 & 0 & 0 & 0 & -x'_n x_n & -x'_n y_n & -x'_n \\
0 & 0 & 0 & x_1 & y_1 & 1 & -y'_1 x_1 & -y'_1 y_1 & -y'_1 \\
\vdots & & & & & & & & \vdots \\
0 & 0 & 0 & x_n & y_n & 1 & -y'_n x_n & -y'_n y_n & -y'_n
\end{bmatrix}$$

This problem can be easily solved using Lagrange multipliers:
p is the eigenvector of matrix $M^T M$ associated to the smallest eigenvalue.

The whole algorithm can be written in 1 (long) line of Matlab!

proof

Lagrangian function

$$L = E - \lambda (p^T p - 1) = p^T M^T M p - \lambda (p^T p - 1)$$

$$\frac{dL}{dp} = 0 \;\Rightarrow\; M^T M p - \lambda p = 0$$

$$M^T M p = \lambda p$$

p is an eigenvector of matrix $M^T M$

which one? $E = p^T M^T M p = \lambda p^T p = \lambda$, so choose $\lambda_{\min}$
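
A sketch of this estimation in plain Python. The matrix M is built from the point pairs and p is taken as the eigenvector of M^T M with the smallest eigenvalue. The eigen-solver used here (a cyclic Jacobi iteration) is an assumption chosen to keep the example free of external libraries; the slides do not say which solver is used, and in practice a library eigen or SVD routine would be the natural choice. All function names and test points are hypothetical.

```python
import math


def smallest_eigenvector(A, sweeps=30):
    """Cyclic Jacobi iteration for a symmetric matrix A (list of lists).

    Returns the unit eigenvector associated with the smallest eigenvalue.
    """
    n = len(A)
    A = [row[:] for row in A]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p, q of A and V
                    A[k][p], A[k][q] = c * A[k][p] - s * A[k][q], s * A[k][p] + c * A[k][q]
                    V[k][p], V[k][q] = c * V[k][p] - s * V[k][q], s * V[k][p] + c * V[k][q]
                for k in range(n):  # rotate rows p, q of A
                    A[p][k], A[q][k] = c * A[p][k] - s * A[q][k], s * A[p][k] + c * A[q][k]
    i_min = min(range(n), key=lambda i: A[i][i])
    return [V[k][i_min] for k in range(n)]


def estimate_homography(src, dst):
    """Minimize p^T M^T M p subject to p^T p = 1."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -xp * x, -xp * y, -xp])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -yp * x, -yp * y, -yp])
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(9)] for i in range(9)]
    return smallest_eigenvector(mtm)


def apply_homography(p, x, y):
    """Map (x, y) with parameters p = (p1, ..., p9)."""
    w = p[6] * x + p[7] * y + p[8]
    return (p[0] * x + p[1] * y + p[2]) / w, (p[3] * x + p[4] * y + p[5]) / w


# Noise-free test data generated with made-up parameters
true_p = [1.0, 0.2, 0.5, -0.1, 0.9, 0.3, 0.05, 0.02, 1.0]
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.25)]
dst = [apply_homography(true_p, x, y) for x, y in src]

p = estimate_homography(src, dst)
for (x, y), target in zip(src, dst):
    print(apply_homography(p, x, y), target)
```

The estimate p matches `true_p` only up to scale and sign, which is expected since the homography has 8 degrees of freedom and p is normalized to unit norm.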
other transformations?

The other transformations (translation, affine, polynomial) are easily estimated by the minimization of the SSD criterion E.

Only the rigid body transformation is a bit more difficult because matrix R is not free. It is a rotation matrix: $R^T R = R R^T = I$, and the SSD criterion must be optimized under this restriction.

This problem can be solved using the singular value decomposition of the data.
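
For 2-D points the constrained problem also has a simple closed form, sketched below. This is not the singular value decomposition mentioned above but a swapped-in 2-D special case: after centering both point sets, the SSD-optimal rotation angle is atan2(b, a), where a and b are sums of dot and cross products of the centered points. The function name and the test data are hypothetical.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst (2-D)."""
    n = len(src)
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    mxp = sum(p[0] for p in dst) / n
    myp = sum(p[1] for p in dst) / n
    a = b = 0.0
    for (x, y), (xp, yp) in zip(src, dst):
        x, y, xp, yp = x - mx, y - my, xp - mxp, yp - myp
        a += x * xp + y * yp      # sum of dot products
        b += x * yp - y * xp      # sum of cross products
    angle = math.atan2(b, a)
    c, s = math.cos(angle), math.sin(angle)
    tx = mxp - (c * mx - s * my)  # t = mean(dst) - R mean(src)
    ty = myp - (s * mx + c * my)
    return angle, (tx, ty)


# Test data: rotate by 30 degrees and translate by (2, -1)
true_angle, true_t = math.radians(30.0), (2.0, -1.0)
src = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
c, s = math.cos(true_angle), math.sin(true_angle)
dst = [(c * x - s * y + true_t[0], s * x + c * y + true_t[1]) for x, y in src]

angle, t = fit_rigid_2d(src, dst)
print(math.degrees(angle), t)
```

Because the angle is the only free parameter of R in 2-D, the restriction on R is satisfied by construction.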
unknown correspondence

This is a difficult problem!

We need to estimate a permutation matrix, e.g.,

$$P = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

which minimizes the matching criterion E.

tough!

See the paper by Maciel & Costeira, PAMI03

suboptimal approaches are used instead!
ransac

RANSAC stands for Random Sample Consensus (Fischler, Bolles, 1981)

It is based on hypothesis generation and classification of data points as inliers and outliers.

[Figure: estimate translation. Only 2 points are matched: bad attempt!]
ransac (2)

Objective: to estimate a transform $W(x;\theta)$ with 2n degrees of freedom.

Algorithm

Hypotheses generation:
• randomly select n pairs of points $(x_i, x'_k)$
• estimate the geometric transformation $W(x;\theta)$

Compute the number of points which were correctly aligned (support), i.e., such that

$$\| x'_k - W(x_i;\theta) \| < \varepsilon$$

Model selection: choose the transformation with largest support

Refinement: improve the estimate of θ by applying the least squares method to the subset of points which are well aligned.
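
The algorithm can be sketched for the simplest case, a translation (2 degrees of freedom, so n = 1 pair per hypothesis). The function name, the tolerance `eps`, the number of trials and the synthetic data are assumptions made for illustration.

```python
import math
import random


def ransac_translation(pairs, eps=0.5, trials=50, seed=0):
    """Estimate a translation from point pairs ((x, y), (x', y')) with outliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(trials):
        # Hypothesis generation: one random pair defines a translation
        (x, y), (xp, yp) = rng.choice(pairs)
        tx, ty = xp - x, yp - y
        # Support: pairs aligned by this hypothesis within eps
        inliers = [((a, b), (ap, bp)) for (a, b), (ap, bp) in pairs
                   if math.hypot(ap - a - tx, bp - b - ty) < eps]
        # Model selection: keep the hypothesis with largest support
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refinement: the least squares translation is the mean inlier displacement
    n = len(best_inliers)
    tx = sum(ap - a for (a, _), (ap, _) in best_inliers) / n
    ty = sum(bp - b for (_, b), (_, bp) in best_inliers) / n
    return (tx, ty), n


# Synthetic data: 8 inliers near translation (3, -2) and 3 gross outliers
pairs = []
for i in range(8):
    noise = 0.05 if i % 2 == 0 else -0.05
    x, y = float(i), float((2 * i) % 5)
    pairs.append(((x, y), (x + 3.0 + noise, y - 2.0 - noise)))
pairs += [((0.0, 0.0), (10.0, 10.0)),
          ((1.0, 1.0), (-5.0, 4.0)),
          ((2.0, 3.0), (7.0, -9.0))]

translation, support = ransac_translation(pairs)
print(translation, support)
```

For transformations with more degrees of freedom only the hypothesis step and the refinement step change; the support count and the model selection stay the same.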
example - registration

example - mosaicing

[Figure: homography (4 marks)]

example (cont.)

[Figure: mosaicing]

3D ultrasound

[Figure panels: without alignment, with alignment]

non-rigid alignment

[Figure: Kybic, Unser, 2003]

region tracking

Two steps:
• region detection
• region tracking

Region detection



problem 



goal: 

• detect all moving objects 

assumptions 

• static camera 

• static background 

• slow illumination changes
Evolution of pixel color

Background subtraction

[Figure: background image; foreground and background pixels]

Pixel classification

If $|I(x,y) - B(x,y)| < \lambda$, the pixel is classified as a background pixel. Otherwise it is classified as active.
Basic background subtraction

The basic background subtraction classifies each pixel I(x,y) as active or not:

$$A(x,y) = \begin{cases} 1 & \text{if } |I(x,y) - B(x,y)| > \lambda \\ 0 & \text{otherwise} \end{cases}$$

Image A(x,y) is very noisy. It has many small regions classified as active and some true objects appear fragmented in several regions.

Morphological post-processing is usually done. Typically we compute all connected components and eliminate all the small regions.
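
A minimal sketch of these two steps on grayscale images stored as lists of lists. The function names, the use of 4-connectivity and the size limit `min_size` are assumptions; the slides do not fix them.

```python
def active_mask(image, background, threshold):
    """A(x, y) = 1 where |I - B| exceeds the threshold, 0 otherwise."""
    return [[1 if abs(i - b) > threshold else 0 for i, b in zip(row_i, row_b)]
            for row_i, row_b in zip(image, background)]


def remove_small_regions(mask, min_size):
    """Keep only 4-connected components with at least min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    cleaned = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            stack, region = [(r, c)], []      # flood fill of one component
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(region) >= min_size:
                for y, x in region:
                    cleaned[y][x] = 1
    return cleaned


# Toy data: a 2x2 object and one isolated noisy pixel
background = [[10] * 5 for _ in range(4)]
image = [
    [10, 10, 10, 10, 60],
    [10, 50, 50, 10, 10],
    [10, 50, 50, 10, 10],
    [10, 10, 10, 10, 12],
]
mask = active_mask(image, background, threshold=20)
cleaned = remove_small_regions(mask, min_size=3)
for row in cleaned:
    print(row)
```

The isolated pixel in the top-right corner is active in `mask` but disappears from `cleaned`, while the 2x2 object survives.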
Example

How to deal with time-varying illumination?

Illumination changes can be compensated by the adaptation of the background image.

Only the pixels belonging to the background region should be adapted.

$$B(x,y,t) = \alpha B(x,y,t-1) + (1-\alpha) I(x,y,t) \qquad \text{background pixels}$$
$$B(x,y,t) = B(x,y,t-1) \qquad \text{foreground pixels}$$
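
The selective update can be sketched as follows. The function name and the way pixels are labeled (the basic threshold test on the difference to the current background) are assumptions made for the example.

```python
def update_background(background, frame, alpha, threshold):
    """Adapt background pixels only; foreground pixels keep their old value."""
    new_background = []
    for row_b, row_i in zip(background, frame):
        new_row = []
        for b, i in zip(row_b, row_i):
            if abs(i - b) > threshold:
                new_row.append(b)                            # foreground pixel
            else:
                new_row.append(alpha * b + (1 - alpha) * i)  # background pixel
        new_background.append(new_row)
    return new_background


background = [[100.0, 100.0]]
frame = [[104.0, 200.0]]   # slight illumination change, then a moving object
updated = update_background(background, frame, alpha=0.9, threshold=20.0)
print(updated)
```

The first pixel drifts slowly towards the new illumination level (about 100.4), while the second one, covered by an object, is left untouched.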
Gaussian background model

(see Wren et al., 1997)

Background pixels are corrupted by noise. We can model each pixel as a random variable with Gaussian distribution

$$I(x,y) \sim N(\mu(x,y), R(x,y))$$

[Figure: distribution of the pixel color in the R-G plane]

pixel classification

$$p(I(x,y)) > \lambda \;\Rightarrow\; \text{background pixel}$$
$$p(I(x,y)) < \lambda \;\Rightarrow\; \text{foreground pixel}$$

$$p(I(x,y)) = \frac{1}{(2\pi)^{n/2} \det(R)^{1/2}} \exp\left( -\frac{1}{2} (I(x,y) - \mu(x,y))^T R^{-1} (I(x,y) - \mu(x,y)) \right)$$

where n is the dimension of I(x,y) (n = 3 for RGB pixels) and $\mu$, R are evaluated at (x,y).
Estimation of the Gaussian model

batch

$$\mu(x,y) = \frac{1}{T} \sum_{t=1}^{T} I(x,y,t)$$

$$R(x,y) = \frac{1}{T} \sum_{t=1}^{T} (I(x,y,t) - \mu(x,y))(I(x,y,t) - \mu(x,y))^T$$

adaptive

$$\mu(x,y,t) = \alpha \mu(x,y,t-1) + (1-\alpha) I(x,y,t)$$

$$R(x,y,t) = \alpha R(x,y,t-1) + (1-\alpha)(I(x,y,t) - \mu(x,y,t-1))(I(x,y,t) - \mu(x,y,t-1))^T$$

Only background pixels should be used.
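
A sketch of the adaptive model for a single grayscale pixel, so that the mean and the covariance reduce to scalars (this simplification, the class name and the numeric values are assumptions). The pixel is classified with the density test and the model is adapted only when the pixel is labeled as background.

```python
import math


class PixelGaussian:
    """Scalar Gaussian background model of one pixel."""

    def __init__(self, mean, var, alpha=0.9):
        self.mean, self.var, self.alpha = mean, var, alpha

    def density(self, value):
        """Gaussian density p(I) with the current mean and variance."""
        d = value - self.mean
        return math.exp(-0.5 * d * d / self.var) / math.sqrt(2 * math.pi * self.var)

    def observe(self, value, lam):
        """Classify the pixel; adapt mean and variance if it is background."""
        is_background = self.density(value) > lam
        if is_background:
            d = value - self.mean               # uses the previous mean
            self.mean = self.alpha * self.mean + (1 - self.alpha) * value
            self.var = self.alpha * self.var + (1 - self.alpha) * d * d
        return is_background


model = PixelGaussian(mean=100.0, var=4.0, alpha=0.9)
first = model.observe(101.0, lam=0.01)    # close to the mean: background
second = model.observe(130.0, lam=0.01)   # far from the mean: foreground
print(first, second, model.mean, model.var)
```

After the first observation the mean moves to about 100.1 and the variance to about 3.7; the second observation leaves the model unchanged.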



region tracking

[Figure: tracked regions, with examples of a false alarm, an occlusion and a new track]

Goal: find the trajectory of each object along multiple frames

Difficulties: misdetections, false alarms, occlusions, object splits and merges, new tracks
point tracking

[Figure: points detected in consecutive frames (t, t+1)]

Data $D = \{(t, p_i^t)\}$, where $p_i^t$ is the position of the i-th region at frame t

A track is a sequence of points detected at different (usually consecutive) frames

$$T = \{(t_1,x_1), (t_2,x_2), \ldots, (t_n,x_n)\}, \qquad (t_i,x_i) \in D, \qquad t_i < t_{i+1} \quad (t_{i+1} = t_i + 1)$$
point association

[Figure: points detected in frames t-1 to t+1]

available methods:

Statistical: propagate uncertainty and assume a dynamic model for the target trajectories (e.g., Kalman or PDA filter)

Deterministic: based on assignment costs; do not require dynamic models (e.g., graph based methods)



Jorge Marques, 2008 



hypotheses 




typical assumptions 

(a) only regions detected in consecutive frames can be associated 

(b) regions should correspond to a single target (and vice-versa) 

(c) new objects may appear (track birth) 

(d) objects can disappear or be occluded (track death)

(b') objects can overlap and form groups 
statistical methods

Statistical methods assume we know a set of tracks and wish to extend them in new frames.

They involve 3 steps:
• prediction
• data association
• update

[Figure: current estimate at frame t, prediction for frame t+1 and observations at frame t+1]

Difficulties:
• data association problem
• initialization of new tracks

Methods:
• nearest-neighbor Kalman filter
• probabilistic data association filter
• joint probabilistic data association filter
• particle filter



methods based on graphs

[Figure: graph linking the objects detected in frames 1, 2 and 3]

Nodes correspond to the detected objects in each frame and the links define a solution for the association problem.

Each admissible link has a cost $C_t(i,j)$ (unconnected nodes also have a cost).

Veenman et al. (PAMI 2001)

This method deals with pairs of frames and formulates the association of targets to existing tracks as an assignment problem if M=m.






assignment problem

Problem: there are M agents and m tasks (M=m); we wish to assign one agent to one task minimizing the total cost

$$C = \sum_{i,j=1}^{m} c_{ij}\, a_{ij}$$

Restrictions

$$\sum_{i=1}^{m} a_{ij} = \sum_{j=1}^{m} a_{ij} = 1, \qquad a_{ij} \in \{0,1\}$$

$c_{ij}$ is the cost of assigning agent i to task j and $a_{ij}$ is a binary variable which is equal to 1 if and only if agent i is assigned to task j.

The minimization of C under these restrictions is a linear programming problem for which there are very efficient algorithms, e.g., the Hungarian method.
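
For very small problems the optimum can be checked by enumerating all permutations, as in the sketch below. This brute-force search is a stand-in used only for illustration; it is not the Hungarian method and its cost grows factorially with m. The function name and the cost matrix are hypothetical.

```python
from itertools import permutations


def solve_assignment(cost):
    """Return the binary matrix A and the total cost C of the best assignment."""
    m = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(m)):          # perm[i] = task given to agent i
        total = sum(cost[i][perm[i]] for i in range(m))
        if total < best_cost:
            best_perm, best_cost = perm, total
    A = [[1 if best_perm[i] == j else 0 for j in range(m)] for i in range(m)]
    return A, best_cost


# Made-up cost matrix: rows are agents, columns are tasks
cost = [
    [3, 4, 0],
    [2, 5, 6],
    [7, 2, 3],
]
A, total = solve_assignment(cost)
for row in A:
    print(row)
print("total cost:", total)   # total cost: 4
```

Each row and each column of A contains exactly one 1, which is the pair of restrictions stated above.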
Example

[Figure: agents, tasks and cost matrix; total cost: 0+2+2=4]

$$A = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

In tracking, the association cost can be defined in different ways. Two popular choices are

distance criterion

$$c_{ij} = \| p_i^{t-1} - p_j^{t} \|$$

prediction error

$$c_{ij} = \| p_i^{t-1} + v_i^{t-1} - p_j^{t} \|$$

$v_i^{t-1}$: displacement vector computed from a previous assignment (cannot be used in track initialization)
Birth and death of tracks

The previous method does not account for new tracks, but it has been extended to allow birth and death of tracks.

Consider a problem in which all the targets are new. In this case, all the M tracks should die and all the m targets correspond to new tracks.

How can we do this in the previous framework?

solution: add M virtual targets and m virtual tracks

[Figure: augmented cost matrix with blocks of size M and m]

$$c_{ij} = c_{high} \quad \text{if } i > M \text{ or } j > m$$
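
A sketch of the augmented problem: the M x m cost matrix is padded to (M+m) x (M+m) with the high cost, and the square assignment problem is solved. As an assumption made to keep the example short, the solver is a brute-force search over permutations (usable only for tiny problems); the function name and all numbers are hypothetical.

```python
from itertools import permutations


def associate(cost, c_high):
    """Associate M tracks (rows) with m targets (columns), allowing birth and death.

    Returns (matches, deaths, births) using 0-based indices.
    """
    M, m = len(cost), len(cost[0])
    size = M + m
    # Entries involving a virtual track (i >= M) or a virtual target (j >= m) cost c_high
    full = [[cost[i][j] if i < M and j < m else c_high for j in range(size)]
            for i in range(size)]
    best = min(permutations(range(size)),
               key=lambda perm: sum(full[i][perm[i]] for i in range(size)))
    matches = [(i, best[i]) for i in range(M) if best[i] < m]
    deaths = [i for i in range(M) if best[i] >= m]                   # track -> virtual target
    births = sorted(best[i] for i in range(M, size) if best[i] < m)  # virtual track -> target
    return matches, deaths, births


# Track 0 is close to target 0; track 1 has no plausible target
print(associate([[1.0, 9.0], [8.0, 20.0]], c_high=5.0))
# ([(0, 0)], [1], [1])

# All targets are new: every track dies and every target starts a track
print(associate([[50.0, 60.0], [70.0, 80.0]], c_high=5.0))
# ([], [0, 1], [0, 1])
```

With this padding a real track-target link is only worth keeping when its cost is below the high cost, so the high cost acts as a gate on admissible associations.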
Example 1d

[Figure: 1-D positions of the detected points as a function of the frame]

Costs were computed using the prediction error, except at the beginning of each track.